Incremental Data Mining Using Concurrent Online Refresh of Materialized Data Mining Views
Abstract
Data mining is an iterative process. Users issue a series of similar data mining queries, slightly modifying in each consecutive run either the definition of the mined dataset or the parameters of the mining algorithm. This processing model is best suited to incremental mining algorithms, which reuse the results of previous queries when answering a given query and therefore require those results to be available. One way to preserve them is to use materialized data mining views, which store the mined patterns and refresh them as the underlying data change. Data mining and knowledge discovery often take place in a data warehouse environment, where many relatively small materialized data mining views may be defined over the warehouse. Refreshing each materialized view separately can be expensive if the refresh process has to re-discover patterns in the original database. In this paper we present a novel approach to refreshing materialized data mining views. We show that the concurrent on-line refresh of a set of materialized data mining views is more efficient than the sequential refresh of individual views. We present a framework for integrating the data warehouse refresh process with the maintenance of materialized data mining views. Finally, we demonstrate the feasibility of our approach through several experiments on synthetic data sets.
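The contrast between sequential and concurrent refresh can be illustrated with a small sketch. This is not the paper's algorithm, which the abstract does not describe in detail. It assumes, purely for illustration, that each view keeps support counts of itemsets (up to size 2) over the rows matching a selection predicate, and that a refresh consumes a delta of newly loaded warehouse rows. The names `MiningView`, `refresh_sequential`, `refresh_concurrent`, `DELTA`, and `make_views` are all hypothetical.

```python
from collections import Counter
from itertools import combinations


class MiningView:
    """Hypothetical materialized data mining view: itemset support counts
    over the rows selected by `predicate` (an assumption for this sketch)."""

    def __init__(self, name, predicate, max_size=2):
        self.name = name
        self.predicate = predicate
        self.max_size = max_size
        self.counts = Counter()  # itemset (sorted tuple) -> support count
        self.rows = 0            # number of rows covered by the view

    def absorb(self, row):
        """Fold one delta row into the stored counts if the view selects it."""
        if not self.predicate(row):
            return
        self.rows += 1
        items = sorted(row["items"])
        for k in range(1, self.max_size + 1):
            for itemset in combinations(items, k):
                self.counts[itemset] += 1

    def frequent(self, min_support):
        """Itemsets whose relative support is at least `min_support`."""
        threshold = min_support * self.rows
        return {s: c for s, c in self.counts.items() if c >= threshold}


def refresh_sequential(views, scan):
    """Refresh views one at a time: one full scan of the delta per view.
    Returns the number of rows read."""
    reads = 0
    for view in views:
        for row in scan():
            reads += 1
            view.absorb(row)
    return reads


def refresh_concurrent(views, scan):
    """Refresh all views together: a single shared scan of the delta.
    Returns the number of rows read."""
    reads = 0
    for row in scan():
        reads += 1
        for view in views:
            view.absorb(row)
    return reads


# Illustrative delta of newly loaded rows (made-up data).
DELTA = [
    {"region": "north", "items": {"a", "b"}},
    {"region": "south", "items": {"a", "c"}},
    {"region": "north", "items": {"a", "b", "c"}},
    {"region": "south", "items": {"b"}},
]


def make_views():
    """Two hypothetical views: one over the north region, one over all rows."""
    return [
        MiningView("north", lambda r: r["region"] == "north"),
        MiningView("all", lambda r: True),
    ]


def scan_delta():
    """Stands in for one pass over the delta in the warehouse."""
    return iter(DELTA)
```

Under these assumptions both strategies leave the views in the same state, but the sequential refresh reads the delta once per view, while the concurrent refresh reads it once in total. The sketch counts only row reads; it says nothing about the costs measured in the paper's experiments.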
Related resources
Incremental Association Rule Mining Using Materialized Data Mining Views
Data mining is an interactive and iterative process. Users issue a series of similar queries until they receive satisfying results, yet currently available data mining systems do not support iterative processing of data mining queries and do not allow the results of previous queries to be reused. Consequently, mining algorithms suffer from long processing times, which are unacceptable from the point...
An Architecture of a Data
We present incremental view maintenance algorithms for a data warehouse derived from multiple distributed autonomous data sources. We begin with a detailed framework for analyzing view maintenance algorithms for multiple data sources with concurrent updates. Earlier approaches for view maintenance in the presence of concurrent updates typically require two types of messages: one to compute the ...
Materialized Data Mining Views
Data mining is a useful decision support technique, which can be used to find trends and regularities in warehouses of corporate data. A serious problem in its practical application is the long processing time required by data mining algorithms. Current systems consume minutes or hours to answer simple queries. In this paper we present the concept of materialized data mining views. Materialized da...
Fast Discovery of Sequential Patterns Using Materialized Data Mining Views
Most data mining techniques consist in the discovery of frequently occurring patterns in large data sets. From a user’s point of view, data mining can be seen as advanced querying, where each data mining query specifies the source data set and the requested class of patterns. Unfortunately, current data mining systems consume minutes or hours to answer simple queries, which makes them unsuitable fo...
Data Mining: Concepts and Techniques, 3rd edition
There are a good number of introductory-level textbooks on data warehousing and OLAP technology, including Kimball and Ross [KR02], Imhoff, Galemmo and Geiger [IGG03], Inmon [Inm96], Berson and Smith [BS97], and Thomsen [Tho97]. Chaudhuri and Dayal [CD97] provide a general overview of data warehousing and OLAP technology. A set of research papers on materialized views and data warehouse impleme...
Publication date: 2005